
212 research outputs found

Binary image representation of a ligand binding site: its application to efficient sampling of a conformational ensemble

Author: Kim Sangsoo
Shin Whanchul
Sung Edon
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract. Background: Modelling the ligand binding site of a protein is an important component of understanding protein-ligand interactions and is being actively studied. Even if the side chains are restricted to rotamers, a set of commonly observed low-energy conformations, the exhaustive combinatorial search of ligand binding site conformers is known to be NP-hard. Here we propose a new method, ROTAIMAGE, for modelling the plausible conformers for the ligand binding site given a fixed backbone structure. Results: ROTAIMAGE includes a procedure of selecting ligand binding site residues, exhaustively searching rotameric conformers, clustering them by dissimilarities in pocket shape, and suggesting a representative conformer per cluster. Prior to the clustering, the list of conformers generated by exhaustive search can be reduced by pruning the conformers that have near-identical pocket shapes, which is done using simple bit operations. We tested our approach by modelling the active-site inhibitor binding pockets of matrix metalloproteinase-1 and -13. For both cases, analyzing the conformers based on their pocket shapes substantially reduced the 'computational complexity' (10- to 190-fold). The subsequent clustering revealed that the pocket shapes of both proteins could be grouped into approximately 10 distinct clusters. At this level of clustering, the conformational space spanned by the known crystal structures was well covered. Heatmap analysis identified a few bit blocks that combinatorially dictated the clustering pattern. Using this analytical approach, we demonstrated that each of the bit blocks was associated with a specific pocket residue. Identification of residues that influence the shape of the pocket is an interesting feature unique to the ROTAIMAGE algorithm. Conclusions: ROTAIMAGE is a novel algorithm that was efficient in exploring the conformational space of the ligand binding site. Its ability to identify 'key' pocket residues also provides further insight into conformational flexibility, with specific implications for protein-ligand interactions.
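The abstract above mentions pruning conformers with near-identical pocket shapes using simple bit operations. The following is only a minimal sketch of that general idea, not the ROTAIMAGE implementation: it assumes each pocket shape is encoded as an integer bit image, compares images by XOR and a bit count, and prunes greedily. The function names, the `max_diff` threshold, and the toy 8-bit images are all hypothetical.

```python
def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two bit images."""
    return bin(a ^ b).count("1")


def prune_near_duplicates(images: list[int], max_diff: int) -> list[int]:
    """Return indices of images kept after greedy pruning.

    An image is kept only if it differs from every previously kept
    image by more than max_diff bits (an assumed similarity rule).
    """
    kept: list[int] = []
    for i, img in enumerate(images):
        if all(hamming(img, images[k]) > max_diff for k in kept):
            kept.append(i)
    return kept


# Toy 8-bit "pocket images" (hypothetical data): 1 = occupied grid cell.
conformers = [0b11110000, 0b11110001, 0b00001111, 0b10001111]
representatives = prune_near_duplicates(conformers, max_diff=1)
print(representatives)
```

In this toy run the second and fourth images each differ from an already kept image by a single bit, so only the first and third survive. Any clustering step would then operate on the reduced list.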

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Large Magnetic Anisotropy and Magnetostriction in Thin Films of CoV$_2$O$_4$

Author: Beekman Christianne
Kim Sangsoo
Thompson Christie
Xin Yan
Publication venue: 'American Physical Society (APS)'
Publication date: 13/03/2023

Spinel cobalt vanadate CoV$_2$O$_4$ has been grown on (001) SrTiO$_3$ substrates. Using torque magnetometry experiments, we find that the previously observed temperature-induced anisotropy change, where the easy axis changes from the out-of-plane [001] direction to a biaxial anisotropy with planar easy axes, occurs in a gradual second-order structural phase transition. This work characterizes this transition and the magnetic anisotropies in the (001), (100), and (-110) rotation planes, and explores their field dependence up to 30 T. Below 80 K, hysteretic features appear around the hard axes, i.e., the out-of-plane direction in (-110) and (010) rotations and the planar directions in (001) rotations. This is due to a Zeeman energy that originates from the lag of the magnetization with respect to the applied magnetic field as the sample is rotated. The appearance of the hysteresis, which persists up to very high fields, shows that the anisotropy at low temperature is rather strong. Additionally, field-dependent distortions to the symmetry of the torque response in increasing applied fields show that magnetostriction plays a large role in determining the direction and magnitude of the anisotropy. Comment: Main text: 9 pages and 6 figures; supplemental materials: 9 pages and 10 figures

arXiv.org e-Print Archive

A methodology for multivariate phenotype-based genome-wide association studies to mine pleiotropic genes

Author: Kim Sangsoo
Lee Ji Young
Park Sung Hee
Publication venue: BioMed Central
Publication date: 01/01/2011

Crossref

Springer - Publisher Connector

PubMed Central

FESD: a Functional Element SNPs Database in human

Author: Choi Kyoung Oak
Kang Hyo Jin
Kim Byung-Dong
Kim Sangsoo
Kim Young Joo
Publication venue: Oxford University Press
Publication date: 17/12/2004

We have created the Functional Element SNPs Database (FESD) that categorizes functional elements in human genic regions and provides a set of single nucleotide polymorphisms (SNPs) located within each area. In the FESD, the human genic regions were divided into 10 different functional elements (promoter regions, CpG islands, 5′-untranslated regions (5′-UTRs), translation start sites, splice sites, coding exons, introns, translation stop sites, polyadenylation signals and 3′-UTRs), and all the known SNPs were then assigned to the functional elements in which they lie. With the FESD web interface, users can select a set of SNPs in specific functional elements and get their flanking sequences for genotyping experiments, which will help in finding mutations that contribute to common and polygenic diseases. A web interface for the FESD is freely available at http://combio.kribb.re.kr/ksnp/resd/
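Assigning SNPs to functional elements by position, as the abstract above describes, amounts to an interval lookup. The sketch below is purely illustrative and is not the FESD pipeline: the element coordinates, SNP identifiers, and the `assign_snps` helper are hypothetical, intervals are assumed to be closed and on a single chromosome, and overlapping elements (for example a CpG island inside a promoter) are assumed to each receive the SNP.

```python
from collections import defaultdict


def assign_snps(elements, snps):
    """Map each functional element name to the SNP ids located inside it.

    elements: list of (start, end, name) with inclusive coordinates.
    snps: list of (snp_id, position).
    """
    by_element = defaultdict(list)
    for snp_id, pos in snps:
        for start, end, name in elements:
            if start <= pos <= end:
                by_element[name].append(snp_id)
    return dict(by_element)


# Hypothetical coordinates for one gene region.
elements = [
    (1, 500, "promoter"),
    (300, 700, "CpG island"),
    (501, 650, "5'-UTR"),
    (651, 900, "coding exon"),
]
snps = [("snpA", 120), ("snpB", 350), ("snpC", 800), ("snpD", 950)]

assignment = assign_snps(elements, snps)
print(assignment)
```

The nested loop is quadratic and only suitable for small examples. A genome-scale version would more plausibly sort the intervals or use an interval tree.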

Crossref

PubMed Central

Localizome: a server for identifying transmembrane topologies and TM helices of eukaryotic proteins utilizing domain information

Author: Bhak Jong
Jang Insoo
Kim Sangsoo
Lee Byungwook
Lee Sunghoon
Publication venue: Oxford University Press
Publication date: 14/07/2006

The Localizome server predicts the transmembrane (TM) helix number and TM topology of a user-supplied eukaryotic protein and presents the result as an intuitive graphic representation. It utilizes hmmpfam to detect the presence of Pfam domains and a prediction algorithm, Phobius, to predict the TM helices. The results are combined and checked against the TM topology rules stored in a protein domain database called LocaloDom. LocaloDom is a curated database that contains TM topologies and TM helix numbers of known protein domains. It was constructed from Pfam domains combined with Swiss-Prot annotations and Phobius predictions. The Localizome server corrects the combined results of the user sequence to conform to the rules stored in LocaloDom. Compared with other programs, this server showed the highest accuracy for TM topology prediction: for soluble proteins, the accuracy and coverage were 99 and 75%, respectively, while for TM protein domain regions, they were 96 and 68%, respectively. With a graphical representation of TM topology and TM helix positions with the domain units, the Localizome server is a highly accurate and comprehensive information source for subcellular localization for soluble proteins as well as membrane proteins. The Localizome server can be found at

Crossref

PubMed Central

ScholarWorks@UNIST

An Integrative Remote Sensing Application of Stacked Autoencoder for Atmospheric Correction and Cyanobacteria Estimation Using Hyperspectral Imagery

Author: Baek Sangsoo
Cha YoonKyung
Cho Kyung Hwa
Duan Hongtao
Kang Taegu
Kim Kyunghyun
Kim Minjeong
Kwon Yong Sung
Lee Hyuk
Ligaray Mayzonee
Pyo JongCheol
Publication venue: 'MDPI AG'
Publication date: 01/03/2020

Hyperspectral image sensing can be used to effectively detect the distribution of harmful cyanobacteria. To accomplish this, physical- and/or model-based simulations have been conducted to perform an atmospheric correction (AC) and an estimation of pigments, including phycocyanin (PC) and chlorophyll-a (Chl-a), in cyanobacteria. However, such simulations were undesirable in certain cases, due to the difficulty of representing dynamically changing aerosol and water vapor in the atmosphere and the optical complexity of inland water. Thus, this study was focused on the development of a deep neural network model for AC and cyanobacteria estimation, without considering the physical formulation. The stacked autoencoder (SAE) network was adopted for the feature extraction and dimensionality reduction of hyperspectral imagery. The artificial neural network (ANN) and support vector regression (SVR) were sequentially applied to achieve AC and estimate cyanobacteria concentrations (i.e., SAE-ANN and SAE-SVR). Further, the ANN and SVR models without SAE were compared with SAE-ANN and SAE-SVR models for the performance evaluations. In terms of AC performance, both SAE-ANN and SAE-SVR displayed reasonable accuracy with the Nash-Sutcliffe efficiency (NSE) > 0.7. For PC and Chl-a estimation, the SAE-ANN model showed the best performance, by yielding NSE values > 0.79 and > 0.77, respectively. SAE, with fine tuning operators, improved the accuracy of the original ANN and SVR estimations, in terms of both AC and cyanobacteria estimation. This is primarily attributed to the high-level feature extraction of SAE, which can represent the spatial features of cyanobacteria. Therefore, this study demonstrated that the deep neural network has a strong potential to realize an integrative remote sensing application.
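The accuracy figures in the abstract above are Nash-Sutcliffe efficiency (NSE) values. As a reminder of what that metric measures, here is a small sketch of the standard definition, NSE = 1 - Σ(obs - sim)² / Σ(obs - mean(obs))². The function name and the sample numbers are hypothetical and are not taken from the study.

```python
def nash_sutcliffe(observed, simulated):
    """Standard Nash-Sutcliffe efficiency of simulated vs. observed values.

    1.0 is a perfect match; 0.0 means the model is no better than
    always predicting the mean of the observations.
    """
    if len(observed) != len(simulated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("NSE is undefined when observations are constant")
    return 1.0 - residual / variance


# Hypothetical observed and simulated pigment concentrations.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.0, 2.0, 3.0, 5.0]
print(nash_sutcliffe(obs, sim))
```

Under this definition, a threshold such as NSE > 0.7 means the squared error of the model is less than 30% of the variance of the observations around their mean.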

Multidisciplinary Digital Publishing Institute

ScholarWorks@UNIST

Compiling Multicopy Single-Stranded DNA Sequences from Bacterial Genome Sequences

Author: Dongbin Lim
Sangsoo Kim
Wonseok Yoo
Publication venue: 'Korea Genome Organization'
Publication date: 01/03/2016

A retron is a bacterial retroelement that encodes an RNA gene and a reverse transcriptase (RT). The former, once transcribed, works as a template primer for reverse transcription by the latter. The resulting DNA is covalently linked to the upstream part of the RNA; this chimera is called multicopy single-stranded DNA (msDNA), which is extrachromosomal DNA found in many bacterial species. Based on the conserved features in the eight known msDNA sequences, we developed a detection method and applied it to scan National Center for Biotechnology Information (NCBI) RefSeq bacterial genome sequences. Among 16,844 bacterial sequences possessing a retron-type RT domain, we identified 48 unique types of msDNA. Currently, the biological role of msDNA is not well understood. Our work will be a useful tool in studying the distribution, evolution, and physiological role of msDNA

Directory of Open Access Journals

PubMed Central

Performance Comparison of Two Gene Set Analysis Methods for Genome-wide Association Study Results: GSA-SNP vs i-GSEA4GWAS

Author: Kim Jihye
Kim Sangsoo
Kwon Ji-sun
Nam Dougu
Publication venue: ?????????????????????
Publication date: 01/01/2012

Gene set analysis (GSA) is useful in interpreting a genome-wide association study (GWAS) result in terms of biological mechanism. We compared the performance of two different GSA implementations that accept GWAS p-values of single nucleotide polymorphisms (SNPs) or gene-by-gene summaries thereof, GSA-SNP and i-GSEA4GWAS, under the same settings of inputs and parameters. GSA runs were made with two sets of p-values from a Korean type 2 diabetes mellitus GWAS: 259,188 and 1,152,947 SNPs of the original and imputed genotype datasets, respectively. When Gene Ontology terms were used as gene sets, i-GSEA4GWAS produced 283 and 1,070 hits for the unimputed and imputed datasets, respectively. On the other hand, GSA-SNP reported 94 and 38 hits for the unimputed and imputed datasets, respectively. Similar trends, though to a lesser degree, were observed with Kyoto Encyclopedia of Genes and Genomes (KEGG) gene sets as well. The huge number of hits by i-GSEA4GWAS for the imputed dataset was probably an artifact due to the scaling step in the algorithm. The decrease in hits by GSA-SNP for the imputed dataset may be due to the fact that it relies on Z-statistics, which are sensitive to variations in the background level of associations. Judicious evaluation of the GSA outcomes, perhaps based on multiple programs, is recommended.

CiteSeerX

Directory of Open Access Journals

PubMed Central

ScholarWorks@UNIST